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ABSTRACT 

Cosmic rays are a known problem in astronomy, causing both loss of data and data inaccuracy. 
The problem becomes even more extreme when considering data from a high-radiation environment, 
such as in orbit around Earth or outside the Earth's magnetic field altogether, unprotected, as will 
be the case for the James Webb Space Telescope (JWST). For JWST, all the instruments employ 
nondestructive readout schemes. The most common of these will be "up the ramp" sampling, where 
the detector is read out regularly during the ramp. We study three methods to correct for cosmic rays 
in these ramps: a two-point difference method, a deviation from the fit method, and a y-intercept 
method. We apply these methods to simulated nondestructive read ramps with single-sample groups, 
and varying combinations of flux, number of samples, number of cosmic rays, cosmic-ray location in 
the exposure, and cosmic-ray strength. We show that the y-intercept method is the optimal detection 
method in the read-noise-dominated regime, while both the y-intercept method and the two-point 
difference method are best in the photon-noise-dominated regime, with the latter requiring fewer 
computations. 

Subject headings: Data analysis and Techniques, Astrophysical Data, Astronomical Techniques 



1. INTRODUCTION 

Advances in technology that allow us to observe fainter 
objects, build more complex systems, and send telescopes 
further into space have challenged us to continue to im- 
prove our calibrations. This includes detection methods 
for cosmic rays (CRs) or any charged particle that adds 
a jump to the data. JWST^s orbit at the second Earth- 
Sun Lagrange point (L2), which allows passive cooling 
of the telescope to ^^50 K, puts it outside the protec- 
tive mantel of the Earth's magnetic field. This could 
make CRs a serious problem. Furthermore, newer in- 
frared telescopes will have a lower read-noise, and thus 
we will detect lower CRs. Finally, long observing times 
will be necessary to complete many of the scientific goals 
of the JWST, ca using CRs to be a n even larger problem. 
From a study bv lRobbertol (|2009aD we can calculate that 
in every 2000 s, on average, 13% of the pixels on the 
JWST HgCdTe detectors and 25% of the pixels on the 
JWST Si:As detectors will be affected by CRs. These 
values could be even greater, since this study does not 
take into account secondary particles. For comparison, 
onboard measurements of the NICMOS camera on the 
HST show that about 10% of the pixels show a CR hit 
for ev ery 2000 s of integration (jViana and Wiklind et all 
l2009f ). These CRs will include low-energy CRs from sec- 
ondary particles and the Sun, and higher-energy galactic 
CRs (Robborto 20093). Clearly, a reliable method to de- 
tect both low- and high-energy CRs is needed. 

In this article we will discuss CR detection methods 
for infrared data, whic h use nondestructive read ramps. 
lOffenberg et al.l (|2001l ) found that in order to correct for 
CRs, the nondestructive readout scheme was most effi- 
cient. By integrating the charge on each pixel in this 
way it is possible to calculate the slope of these ramps in 
counts per time to get the flux of the sky (Riekc 2007). 

|randers@stsci.edu, kgordon@stsci.edu| 



We refer to the integrated charge as a ramp made up 
of a specified number of samples. We discuss the slope 
of these ramps as the calculation of the flux. Finally, a 
CR affects the ramp between samples, however we use 
'sample number' to refer to the first sample after the CR 
hit. 

C orrecting for CRs in ramps is not a new prob- 
lem (jOffenberg et al.lll999[ ). However, the advent of large 
nondestructive arrays in orbit coupled with ground-based 
processing of the ramps means, when it comes to calcu- 
lating the slope of ramps, that there arc more options 
available for CR detection. In addition to CRs, noise is 
added to the data by the detector and readout electron- 
ics as well (Fanson 1998, Tian et al. 1996, Rieke 2007). 
In this article we focus on photon-noise and read-noise 
for our correlated and uncorrelated noise, respectively; 
however, we are not restricted to just these two terms. 
The work in this article is general and can account for any 
correlated and uncorrelated noise sources. Therefore, our 
question is. What is the best we can do at finding CRs 
in a ramp, given the noise in the ramp? 

To understand and test various CR detection methods 
for ramps, we simulate infrared data as described in Sec- 
tion [2j Anytime a slope is calculated for the preceding 
process or for the CR detection methods, we do so using 
linear regression for data with correlated and random 
uncertainties (in consideration of the photon-noise and 
the read-noise, respectively), as described in Section [31 
In Section |4] we propose three techniques to detect CRs. 
The first is a two-point difference method, the second is 
a deviation from the fit method, and the third is a y- 
intercept comparison. There are various conditions that 
can hinder/aid in CR rejection (e.g., number of samples, 
slope, number of CRs, size of CRs, and location of CRs); 
therefore, we aim to study combinations of these and to 
find which algorithm behaves the best under different 
conditions. These results are presented in Section [5l A 
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Parameters Values 

Slope 70.0 e-/s 

Y-Intercept 21000.0 e" 

Number of Samples 40 

Sample Time 27.7 s 

Read Noise 16.0/\/8 e~ /sample 



Table 1 

Parameters Used to Create MIRI Ramps. 



discussion on our findings is in Section ^ Concluding 
remarks can be found in Section [T] 

2. SIMULATING NONDESTRUCTIVE READ 
RAMPS 

In order to test how well a CR detection method works, 
we have to test the method on a known CR. For that 
reason, we have simulated ramps with photon-noise and 
read-noise added in, to which we can either add CRs 
with known location in the exposure and known am- 
plitudes or leave the ramps clean (CR-free). We build 
these ramps with the slope, y-intercept, number of sam- 
ples, time between samples (sample time), and read- noise 
as inputs, using parameters given in Table [T] as guide- 
lines (slope and number of samples will be changed in 
this study). The only value that is instrument-specific 
is the read-noise, which was chosen to match the ex- 
pected value of the Mid- Infrared Instrument (MIRI) on 
the JWST. Note that we do not group any of the sam- 
ples by taking a weighted average or coadding. Further- 
more, we assume uniform sampling (i.e., constant time 
between samples), as that is what is used by the JWST 
and also so that calculations are more efficient. We also 
assume that a nonlinearity correction has already been 
performed with no error, and we do not account for the 
effects of quantization noise. Although we are simulat- 
ing MIRI detector parameters, this is only an example. 
The methods discussed in this article will apply to any 
other nondestructive read data, including those for the 
JWST near-infrared detectors (Near- Infrared Camera 
[NIRCam] and Near- Infrared Spectrograph [NIRSpec]) 
the HST infrared detectors ( Wide Field Camera 3 - IR 
[WFC3-IR] and Near-Infrared Camera and Multi-Object 
Spectrometer [NICMOS]) and ground-based detectors, 
by simply changing the values in Table [T] If the data 
you would like to simulate do included grouped samples, 
however, some revisions will have to be made. 

At time t = the counts j/o a-rc equal to the y- 
interceptQ This is not considered part of the ramp, be- 
cause t = is when the reset occurred, but wc start 
reading at time equal to one sample time, ti. 

Each signal is comprised of the counts from the pre- 
vious signal plus the additional signal, the photon-noise 
from the additional signal, and the read-noise. To con- 
struct the ramps, wc follow this procedure: 

1. For each ramp calculate the expected signal (i.e., 
no noise, clean), Sg, defined as 

Se=mts, (1) 

^ Although we use the value for the y-intercept given in Table [T] 
this is an arbitrary value and does not change our calculations. 



where m is the input slope, and ts is the sample 
time. 

2. For each sample calculate the sum of the expected 
signal and the unique photon-noise of this signal 
(se +<7„pi) from a Poisson distribution with A = Se- 
Add this to the signal from the previous sample (or 
y-intercept if it is the first sample). This step is 
shown in equation [2j Since we have only added the 
correlated noise, and we still need to add the uncor- 
rclatcd noise, wc have called the signal y* instead 
of Vt). 

y* ^ y*-l + Se + CTn^z- (2) 

3. If there are to be one or more CRs to the ramp, 
simply add the electrons with the expected signal: 

Vi Vi-l +Se+ a-n^i + CRniag, (3) 

where CRmag is the magnitude of the CR. 

4. Read noise, denoted by Ur, is due to readout elec- 
tronics; consequently, it is uncorrelated. Therefore, 
when all samples are populated, add a unique read- 
noise, (Tn^i, to each sample (equation |4]): 

yt = y* + crn^i- (4) 

For each sample an^i is taken from a Gaussian with 
n,. as the standard deviation. 

The uncertainties for the samples in the ramp are cal- 
culated by adding the photon-noise and read-noise in 
quadrature: 



ay^^n^ + nl, (5) 

where Hp is the Poisson noise of the expected signal in the 
ramp; thus. Hp — \Jmts, and Ur is taken from Table [TJ 
We have chosen to give each sample equal weighting, 
rather than choosing the photon-noise to be the Poisson 
noise of the counts in the ramp, so that we can improve 
our results by avoiding a bias based on sample number. 

3. LINEAR FIT ALGORITHM FOR DATA WITH 
CORRELATED ERRORS 

For two of the three CR detection methods, it is 
very important that we use the best calculation of the 
slope and y-intercept . Therefore we must use all in- 
formation inc l uding correlated and uncorrelated errors. 
iGordon et al.l ()2005D explains how to take into account 
both correlated and uncorrelated errors for the slope and 
y-interccpt uncertainties, and here we describe how to 
include this information for the fit as well, using matrix 
notation and specifically making use of the covariance 
matrix. For a refresher on calculating a linear fit using 
matrices as well a s the use of the covariance matrix, see 
iHogg et all (|201Cl[ ). Every time a slope is calculated in 
the CR detection methods described in this article, we 
do so using this method. 

The real power of using matrix notation comes from 
the covariance matrix, C, as it includes information 
about the degree of correlation between samples. Fur- 
thermore, the slope and y-intercept uncertainties result- 
ing from using this covariance matrix automatically take 
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into account the correlated and uncorrelated errors, pro- 
vided they are included in the covariance matrix. C is 
defined as: 



C2,i cr. 



Cl,2 
2 

y2 



■C2M 



cn.i CN.2 ■ 



(6) 



The values on the diagonal of C are the uncertain- 
ties of the Ui, whereas the off-diagonal elements are the 
covariance between the different y^. If the data are un- 
correlated, then the off-diagonal matrix elements are all 
zero. Ci,j = 0,i ^ j. However, we do have correlated 
data due to the photon-noise. 

When co nstructing C for cor related errors, we follow 
the logic of iFixsen et al.l ()2000[ ) and think of C as the 
sum of two matrices: one fo r the photon-noise, P, and 
one for the read-noise, R. IFixsen et al.] (pOOO ) define 
these matrices for the case where the data are the two- 
point difference of the samples (the photon-noise is not 
correlated, and the read-noise is correlated). However, 
for fitting a line to the individual samples, it is just the 
opposite. The read-noise is uncorrelated, and thus R is 
just on the diagonal and zero elsewhere. 

The diagonal of P is n^, just as it would be for uncor- 
related data, but since the photon-noise is correlated, the 
off-diagonal elements are not zero. An estimate of all of 
the photon-noise in jji is the photon-noise added to each 
sample multiplied by i (i.e., iup). The correlation be- 
tween samples yi and yj is the photon-noise that is in yi 
that is also in yj. The elements in P that represent this 
correlation are pij and pj^i. Consider j/2 and ya, where 
the photon-noise they share would be the photon-noise 
in 2/2 (2?^p), plus the photon-noise in the first read, pi^i- 
Thus, 



kn +_pi,i. 



where 



k = 



U, 



J < I 
i < j' 



(7) 



(8) 



If pi^i is included in the background, then it is set to 
zero. We can estimate Up using the slope as we did in 
Section [2] rip = \/s^- This initial calculation of the 
slope for Se (eq. [T|) is done before any CR correction and 
without taking into account correlated errors. 

This technique accounts for the diagonal elements as 
well. Therefore, 



+ Pi,i + n;i =j, 



(9) 



where is the read-noise. 

An equation for the noise in a ramp has previously b een 
derived in nonmatrix form in iRauscher et al.l 120071 : for 
an independent derivation and t he corr ect final formula, 
see ([Robberto 200 9b). Rausch cr et al.l formula (eq. (1) 
in that article, see erratum) has the benefit that it takes 
into account grouped samples. However, the benefit of 
using matrix notation with a covariance matrix is that 
we can add other noise terms easily by calculating the 
appropriate covariance matrix (like we did for P and R) 



and sum all to get C. I t would be more diffic ult to add 
other noise terms to the IRauscher et akl (120071 ) equation, 
and like our method here, they only include read-noise 
and photon-noise. 

3.1. Validating with Simulations 

To demonstrate how well this calculation of the un- 
certainties fi ts simulated data, we followed the idea from 
Figure 16 in iGordon et all ()2005D and simulated 10,000 
ramps each with read-noise only, photon-noise only, and 
both, then we used the covariance matrix to calculate 
the y-intercept and slope uncertainties. This is given in 
Figured] The dashed and dotted lines are the uncertain- 
ties with read-noise and photon-noise only, respectively. 
The solid lines are the uncertainties in the ramps where 
both read-noise and photon-noise were added. Notice 
that you can see where the transition is between the read- 
noise-dominated regime and the photon-noise-dominated 
regime using these plots. The circles are the uncertain- 
ties calculated using the method described in Section |31 
which fit the data perfectly. 

In addition to using a covariance matrix, other popular 
methods to fitting a line to ramps have used uncertain- 
ties as the weights (1/ g^), equal weights, or optimum 
weights, as discussed in IFixsen et al.l (|2000f ). To demon- 
strate how these methods compare, we simulated 10,000 
ramps, each with the same input slope (before the noise 
was added) and then tried to retrieve these slopes using 
the various weighting schemes (Figure [2]). The standard 
deviation in the slope fits is plotted in Figure [5] (a). We 
see that the covariance matrix and optimal weighting re- 
sult in a higher signal-to- noise ratio (S/N) than the other 
two methods. Optimal weighting estimates a covariance 
matrix, so it is expected that they show similar results. 
In Figure [2] (b) we show the ratio of the average cal- 
culated slope to the input slope, with unity subtracted. 
Note that the curve in this plot is nonsmooth due to 
the finite number of trials. The error on the slope cal- 
culation is very small, with the calculated slope at most 
0.004% from the input slope. For various slopes we made 
a histogram of the distribution of these ratios and found 
that they were symmetric and centered on the average. 
Therefore, optimal weighting and using a covariance ma- 
trix with correlated and uncorrelated errors produces a 
lower standard deviation of the results, while all methods 
produce a similar slope estimate with 10,000 trials. 

4. CR DETECTION METHODS 

All three CR detection methods have a unique algo- 
rithm. However, all follow the same procedure to cal- 
culate a slope once the CRs are found, which we now 
outline: 

1. Detect CRs, always looking for the largest outliers 
first. 

2. Calculate the slope for the resulting ramp seg- 
ments before and after the CR events (we will refer 
to these as "se miramps" from here on, following 
IR,obbe7tol[2008h . 

3. Calculate the final slope of the entire ramp. If there 
is one CR or more, do this by taking the weighted 
average of the slopes of the semiramps (see also the 
discussion in Section |6]). 
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Slope [e-/s] Slope [e-/s] 

Figure 1. Calculatio n of the uncertainties for the y-intercept (a) and the slope (b) using the method described in Section |3] Following the 
idea from Figure 16 in lGordon et al] 1 )20051 ). for 10,000 each, we generated ramps with read-noisc only, photon-noise only, and then both. 
The uncertainties in the y-intercept and slope given by the data are plotted by the dashed, dotted, and solid lines. Then using correlated 
and uncorrelatcd components in the covariance matrix (e.g., including both read- noise and photon- noise) we calculated the corresponding 
uncertainties shown by the circles. These calculations match the data perfectly. Furthermore, this plot also shows where the transition is 
between the read-noise-dominated regime and the photon-noise-dominated regime. 
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Figure 2. (a) Standard deviation of slope fits versus input slope, 
for various weighting schemes to point out that using the covari- 
ance matrix has lower noise (though it is very similar to optimum 
weighting, as optimum weighting is an estimate of the covariance 
matrix), (b) Calculated slope to input slope ratio minus one for 
the same methods. All methods show the same bias; therefore, the 
difference lies only in the S/N shown in (a). 

Furthermore, for all methods we use the absolute dif- 
ference so that we remove outliers in both directions and 
do not bias the data jOffenberg ct al. 1999, 2001). How- 
ever, if larger rejection thresholds are used, and therefore 
there are no longer as many outliers (i.e., only picking up 
the stronges t CRs). then one-sided clipping would work 
just as well (jWindhorst et al.lll994) . 

4.1. Two-Point Difference Method 

With the nondestructive readout method, algorithms 
for CR detection include comp uting the two-point differ- 
ences. lOffenberg et~all ([1999 ) and iFixsen et all (pOOOl ) 
describe versions of this method, though both use only 
one rejection threshold for all cases (Scr and 4.5(j, re- 
spectively). Another version was used in the Multiband 
Imaging Ph otometer (MIPS) d ata reduction algorithm 
for Spitzer (|Gordon et al.|[2005h . We describe a variant 
of this technique here. 



Figure 3. 2PT DIFF method, (a) Solid line and points are the 
data, yi, and the dotted line is the two-point diff eren ces, di, of this 
data, (b) Dashed line is the ratio in equation 1101 which will be 
compared with the rejection threshold, rt. The vertical line shows 
the interval where the CR hit. 

For the two-point difference method (hereafter 2pt 
diff) , we calculate the two-point differences between the 
counts in each set of adjacent samples. The largest out- 
lier is flagged as a CR, given that it fulfills the rejection 
criteria: 

> (10) 

where di is the difference between the science data 
and yi, [i^, is the median of d, ad is the uncertainty of 
d, and rt is the rejection threshold. The median is used 
because it is more ro bust than the mean when there are 
outlie rs in the data () Press et al.l 119861 : lOffenberg et al.l 
Il999f ). The median can become a problem if there is 
quantization noise (i.e., if the slope of the ramp were 0.07 
e~/s), however we do not take into account quantization 
noise in this article. Once a CR is identified, the di that 
includes the CR is removed, and the process is repeated 
on the remaining di until no more CRs are found. This 
method is depicted in Figure [3] 
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When calculating the uncertainty in d, ad, the most 
obvious solution would be to use the standard deviation. 
However we have found that this does not work well for 
ramps with a low number of samples 5). We can im- 
prove ad by using the photon-noise and read-noise added 
in quadrature. The photon-noise can be calculated as 
Poisson noise, but since we are dealing with the two- 
point difference we use the charge accumulated since the 
last sample, rather than the total charge in a sample, so 
we use Since a CR can contaminate a slope calcula- 
tion, we can use /x^ to estimate s^. Therefore Hp = ^/Jld, 
and 



2nl 



(11) 



where the factor 2 is due to the read-noise from each of 
the two samples. 

4.2. Deviation from the Fit Method 

The method we will refer to as the deviation from the 
fit method (h ereafter DEV from fit), is the o ne used 
by N ICMOS (iDahlen et all 120081) and WFC3 (iDressell 
[20T0h . To use this method we fit a line to the ramp using 
a covariance matrix as described in Section [3l Then, for 
each sample, we calculate the difference to the fit as a 
ratio to the uncertainty in the counts: 



devj 



Vi - fi 



(12) 



where fi is the fit at sample y^, and Oj. is the uncertainty 
in each sample, defined in equation [5] 

We then take the first difference of these ratios, and 
look for the largest. If it satisfies the criteria: 



\devi+i — devi\ > rt, 



(13) 



then yi+i is flagged as a CR. The ramp is then split 
into semiramps, and this method is applied again to the 
resulting semiramps. 

The DEV FROM FIT method is iUustratcd in Figure SI 
Note that the background level is not at zero as it was 
for the 2pt DIFF method, but 10. The reason for 
the nonzero background is that while the median (which 
would exclude the CR) is subtracted in the 2pt diff 
method, in the dev from fit method the fit is sub- 
tracted, which includes the CR in its calculation. Fur- 
thermore, in the presence of a CR, the slope will be over- 
estimated and, therefore so will the photon-noise. Fi- 
nally, the peak is at a ratio of ~60, whereas both the 
2pt diff and y-int methods (as we wih see) peak at 
ratios of ^80. 

4.3. Y-Intercept Method 

The idea for the y-intercept method (hereafter Y- 
int) comes from the M IPS data reduction algo- 
rithm (jCordon et al.ll2005[ ). The details we describe here 
are a variant of that method. For the y-int method, 
wc step through each sample and assume that there is 
a CR there (the first sample is skipped). We fit a line 
to the semiramps before and after the sample with the 
assumed CR, using a covariance matrix as described in 
Section [3] At each sample, the x-axis is shifted so that 
the y-intcrcept is located at the sample number of the as- 
sumed CR. The only exception is when we assume that 




2 4 6 

Sample Number 

Figure 4. DEV FROM FIT method. The soUd Unc and points are 
the data, yi and the dotted line is the linear fit made up of points, 
fi . In the lower panel the dotted line is the ratio of the difference 
between the fit and the data to the uncertainty, devi . The dashed 
line is the two-point differences of the deVi. This is what will be 
compared with the rejection threshold, rt. The vertical line shows 
the interval where the CR hit. 

there is a CR in the second sample. Since we cannot 
calculate a y-interccpt for only one point, we shift the 
x-axis to the first sample, and use the counts in that 
sample as the y-intercept, and take the read-noise as the 
uncertainty. We do the mirror to this when assuming 
that there is a CR in the last sample. Then, we take 
the ratio of the absolute differences between the two y- 
intercepts (6i and 62) to the expected uncertainty, ab 
(equation [n)) . 



> n 



(14) 



After stepping through every sample, we look at the sam- 
ple with the largest ratio. If this ratio is larger than a 
given rt , then we flag that sample as a CR. The process 
is repeated on the resulting semiramps. The method is 
depicted in Figure [5l 

The expected uncertainty, a^, is calculated from the 
y-intercept uncertainty from the read-noise and the y- 
intercept uncertainty from the photon-noise added in 
quadrature. There are two read-noise components, 
and 71^2 , one from the ramp before the assumed CR, and 
one from the ramp after. These read-noise terms are 
calculated by setting the covariance matrix, C, to the 
read-noise matrix, R, and recalculating the y-intercept 
uncertainty. The photon-noise component is calculated 
as Up = Since in the Y-int method we are deal- 

ing with two ramps, we take the weighted average of the 
slopes to get Se, just as we would if there were actually 
a CR found at the assumed location. Therefore, 



(15) 



In Figure[S](c) the dashed line representing equationlTH 
is hovering around zero before the CR detection, and 
then afterward it increases to '--^lO. This effect comes 
from the difference between the y-intercepts. It is caused 
by the fact that we shift the x-axis to the sample with the 
assumed CR, whereas if we shift it to the sample before 
the CR (as we do for a CR in the second sample, notice 
that there the ratio is '-^lO), then the difference between 
61 and 62 before the actual CR would be greater than 
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Sample Number 



Figure 5. Y-INT method, (a) Counts and two linear fits as it 
would look when assuming that the CR is in sample 3. The solid 
line and points are the data, and the dotted lines are the linear 
fits to the semiramp before and after the assumed CR. The two 
y-intercepts, bi and b2 are highlighted with dots and labeled, (b) 
Assuming the CR is in sample 6 (which indeed it is), (c) Results 
after stepping through the entire ramp assuming a CR is in each 
sample. Plotted is the ratio of the absolute difference between the 
y-interccpts to the uncertainty, which must be above the rejection 
threshold, rt, to be counted as a CR. The interval where the CR 
hit is shown by the vertical line. 



after the actual CR. 



5. RESULTS 



To compare the success of each method at finding CRs, 
we look at both the fraction of CRs found and the false 
detection rate, which we define as the ratio of the num- 
ber of false detections to the number of possible false 
detections. The number of possible false detections is 
calculated as the number of samples minus one, minus 
the number of true CRs, times the number of trials. 

We look at simulated ramps with 5, 10, 20, 30, and 
40 samples, with CRs of 20 different magnitudes from 
to 250 e~ (all CRs with magnitudes >250 e~ are found 
with all methods), and with slopes of 0.0 e~/s, 0.7 e~/s, 
3.5 e^/s, 7.0 e~/s, 35.0 e~/s, and 70.0 e~/s, with one, 
two, or three CRs, and with the CR located in the be- 
ginning, middle, and end of the ramp (e.g. for a ramp 
with 40 samples, we inserted a CR in the 3'''', 10*'', 20*'', 
30*'', and the 35*'' sample). Wc then simulate 10,000 
ramps for each combination, and apply each method to 
each ramp to compare the results. For all trials, the re- 
jection threshold, rt, is chosen such that the rate of false 
detections is the same for all methods (a rate of 5% was 
chosen). In this way we can best compare how well they 
find the CRs. 

5.1. The Rejection Threshold, rt 

Each of the CR detection methods has a different cri- 
terion (see equations Uni [El andfT4|). which, when com- 
pared with the rejection threshold, decides if a sample 
is to be flagged as containing a CR or not. Therefore, 
the relationship between rejection threshold and the false 
detection rate is different for each method. To better un- 
derstand this dependence, for each CR rejection method 
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Figure 6. Fraction of false detections versus rejection threshold is 
plotted here for various input slopes, (a) 2pt: DlFF method. (b)DEV 
FROM FIT method, (c) Y-iNT method. 

we used a range of rejection thresholds and looked at the 
resulting fraction of false detections. This was done on 
CR-free ramps. 

The first comparison is for the fraction of false detec- 
tions versus the rejection threshold for different input 
slopes (Figure [6|). For a given fraction of false detec- 
tions, the rejection threshold is independent of slope for 
the 2pt diff method, while there is very little change 
for the Y-iNT method. However, for the DEV from fit 
method, the rejection threshold needs to vary to keep the 
fraction of false detections constant across ramp slopes. 

The rejection threshold for different number of samples 
in a ramp is shown in Figure [T] An input slope of 70.0 
e~/s was used, and the rejection threshold was chosen 
such that the fraction of false detections was 0.05. The 
slope of these lines is 0.0011, 0.0065, and -0.0049 for the 
2pt diff, DEV FROM FIT, and Y-INT methods respec- 
tively. Although this is a small change for the 2pt diff 
method, if a CR is found in a ramp then the rejection 
threshold was changed depending on the number of sam- 
ples in a semi-ramp. If the rejection threshold was not 
calculated for a specific number of samples, we interpo- 
lated it. Figure [7] also shows how well the uncertainties 
in the 2pt diff method behave like a Gaussian, thus 
allowing the rejection threshold to be chosen easily. 

5.2. Slope and CR Detection 

As we change the input slope of these ramps, we move 
from the read-noise-dominated regime to the photon- 
noise-dominated regime. In Figure [8] we show the two 
extremes: read-noise-dominated regime with an input 
slope of e~/s (Figure[5](b) and (d)), and photon-noise- 
dominated regime with an input slope of 70 e~/s (Fig- 
ure [8] (a) and (c)). Plotted is the fraction of CRs found 
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Figure 7. Rejection threshold that will result in a fraction of false 
detections of 0.05 is shown for various sample numbers. What we 
would expect from a Gaussian distribution is also plotted to show 
how similar it is to the 2PT DIFF method. A slope of 70.0 e~ /s was 
used for this figure. 

versus the CR magnitude and the fraction of false detec- 
tions versus the CR magnitude. 

In the photon-noise-dominated regime, photon-noise 
can appear as a CR as it is correlated and thus can lead to 
false detections. From Figure[8](a) and (c) we see that all 
of the methods give similar results. However, note how 
the fraction of false detections for the 2pt DIFF method 
is relatively constant compared with the other two meth- 
ods. This shows how well we are able to understand the 
uncertainties of this method. This, coupled with the fact 
that the 2pt diff method requires the least computa- 
tions, leads us to suggest that the 2pt diff method is 
the optimal CR detection method in the photon-noise- 
dominated regime. 

In the read-noise-dominated regime we only have ran- 
dom noise, and therefore we are able to get a more ac- 
curate calculation of the y-intercept and the y-intercept 
uncertainties. Therefore, as is shown in Figure |8] (b) 
and (d), in the read-noise-dominated regime the y-int 
method gives the best results in that it is able to find 
fainter CRs than the other two methods. 

In Figure [8] (d), note that the rate of false detec- 
tions per CR strength is only constant for the 2pt diff 
method, while for the y-int method it is constant only 
for high-energy CRs. Between a CRmag of e~ and 10 
e~ , we see a high fraction of false detections for the y-int 
method. This corresponds to the range where the CRs 
are not always found. The cause of this 'bump' is not 
clear, and is evidence that we do not fully understand 
the noise model associated with the Y-INT method. 

This could be caused by the fact that wc treat each 
scmiramp independently, which is inaccurate, since the 
uncertainties are correlated. Therefore, one solution 
could be to fit all semiramps simultaneously. This would 
increase the uncertainty, which would mean that the re- 
jection threshold has been set too low (which would agree 
with what we see in the plots). We leave that for future 
work. 

Regarding Figure |8] (a) and (b) , if you draw a verti- 
cal line at a given CRmag (we chose the CRmag where 
the fraction of CRs found by the 2pt diff method is 
closest to 50%), you can see the difference between the 
fraction of CRs found by each method at that slope. We 



did this for slopes of O.Oe^/s, 0.7e~/s, 3.5e~/s. 7.0e^/s, 
35.0e~/s. and 70.0e~/s. The results are shown in Fig- 
ure [9] Each ramp had 40 samples, and again the rejec- 
tion threshold was chosen such that the fraction of false 
detections was 5%. For comparison, we subtracted the 
fraction of CRs found by the 2pt diff method. In this 
figure we can see that we move into the regime where 
the Y-INT method performs better than the 2pt diff 
and DEV from fit methods at a slope of ~ 5e^/s. 

To further illustrate how the results of each method 
change with input slope, we show the fraction of CRs 
found as a function of CR magnitude for different slopes 
and for each of our methods in Figure (TUl The results of 
the 2pt diff and dev from fit methods appear to be 
similar, while the y-int method does better with lower 
slopes (read- noise-dominated regime). 

5.3. Number of Samples and CR Detection 

In order to determine how the number of samples in 
a ramp affects how well we detect CRs, we tested each 
algorithm on simulated ramps with 5, 10, 20, 30, and 
40 samples. Figure [TT] shows the fraction of CRs found 
as a function of CR magnitude for different numbers of 
samples for the 2pt diff, dev from fit, and y-int 
methods. 

When we use the 2pt diff method on weak CRs (< 50 
e~) and strong CRs (> 150 e~), the fraction of CRs found 
increases with the number of samples (except for the case 
where there are five samples). For the dev from fit 
method, the fraction of CRs found increases with number 
of samples for string CRs, but shows no difference for 
weak CRs. Finally, for the y-int method, the fraction 
of CRs found decreases with number of samples for weak 
CRs, but increases with number of samples for strong 
CRs. 

5.4. CR Sample Number and CR Detection 

There are two questions when it comes to CR sample 
number: the first is where do we find false detections in 
the ramp, and the second is how well we find CRs in 
given positions in the ramp. 

To answer the first question, we created a histogram 
of the sample numbers of the false detections found on 
simulated CR-free ramps for each method. These are 
presented in Figure [121 The results show that both the 
2pt diff and dev from fit methods are not biased 
toward any position in the ramp. However, the y-int 
method is biased toward the last samples in a ramp where 
there is less effect from the correlation between samples. 

To answer the second question, we simulated ramps 
with 40 samples and input slope of 70 e~/s, with one 
CR each at sample (numbers 3, 10, 20, 30, or 35). We 
then applied each of the algorithms to these ramps and 
compared the results, shown in Figure [T51 

The results of these plots all show that the 2pt diff 
is the only method that does not vary depending on the 
sample number of the CR, as can be expected. Both the 
DEV from fit and Y-iNT methods are weakly biased by 
sample number. This is also expected, since both require 
fitting lines to data, and the fewer points in a line the less 
accurate the fit. Again, in the photon- noise-dominated 
regime, the 2pt diff method is best. 

5.5. Multiple Cosmic Rays and CR Detection 
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Figure 8. CR detection rate and false detection rate as a function of CR magnitude for all three methods. In (a) and (c) the input slope 
was 70 e^/s, in the photon-noise-dominated regime, where we recommend the 2PT DIFF method be used. In (b) and (d) the input slope 
was e^/s, in the read-noise-dominated regime, where the Y-INT method outperformed the other two. Each ramp has 40 samples. 
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Figure 9. Difference between the fraction of CRs found by the 
DEV FROM FIT and Y-iNT methods to the 2PT DIFF method. From 
this plot we are able to see that the Y-iNT method outperforms the 
other two with slopes less than ~ 5e~/s. 



In a 1000 s integration, if we assume that 20% of the 
field will be affected by CRs, then about 4% of the field 
could be affected by two CRs, and 0.8% could be affected 
by three. This is substantial enough that we need to 
account for this possibility. 

To test how well each method does at handling multiple 
CRs, we simulated 10,000 40-sample ramps for each of 
0-19 CRs. Each CR was of the same strength and in 
random but different samples. We show the results for 



0-3 CRs in Figure [H 

As can be seen, the 2pt diff method shows no differ- 
ence between one, two, or three CRs. We show only up 
to three CRs for simplicity, but we have found that the 
2pt DIFF method does not start to show a difference in 
results until about nine CRs and therefore should be the 
optimal method in the photon-noise-dominated regime. 
Both the DEV from fit and y-int methods, although 
show more variance than the 2pt diff method. This is 
caused by the fact that both methods require linear fits 
that will include two CRs for the y-int method (when 
assuming the correct location of one of the CRs) , and all 
three CRs for the DEV FROM fit method. Furthermore, 
for both methods the ramp will be segmented after the 
detection of the first CR. 

6. DISCUSSION 

The three CR detection methods presented in this arti- 
cle were tested on simulated ramps, adjusted, corrected, 
and refined, in order to present optimum versions. We 
now compare our methods to determine which is best 
suited for various data. 

6.1. 2-Point Difference Method 

Our results show that the 2pt diff method performs 
best in the photon-noise-dominated regime. On top of 
being straightforward and computationally simple, we 
showed that the fraction of false detections versus rejec- 
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Figure 10. CR detection rate as a function of CR magnitude for 
all methods. Eacii line is for a different input slope. There are 40 
samples in each ramp. 
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Figure 11. CR detection rate as a function of CR. magnitude for 
all methods. Each line is for a different number of samples in a 
ramp. The input slope is 70 e~/s for each ramp. 
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Figure 12. Histograms of the sample numbers of the false detec- 
tions for each method. These arc from ramps without a simulated 
CR. 



1.0 

0.8 

0.6 

0.4 

0.2 

0.0 
1.0 

J 

I 0.8 

: 0.6 

) 

;o.4 

! 0.2 

0.0 
1.0 

0.8 

0.6 

0.4 

0.2 

0.0 



(a) 2 Pt. Diff. 



— CR Sample* 


= 3 


-- CR Sample* 


= 10 


CR Sample* 


= 20 


■■■ CR Sample* 


= 30 


♦ * CR Sample* 


= 35 



(b) Dev. from Fit 




100 150 
CJi™, [e-] 



Figure 13. CR detection rate as a function of CR magnitude for 
all methods. Each line is for a different CR sample number. The 
input slope is 70 e~/s, and there are 40 samples in each ramp. 
The 2pt diff method is the only one to not show a dependence on 
sample number. 



tion threshold is consistent regardless of slope and num- 
ber of samples. We also showed that the uncertainties 
of the 2pt diff method follow a Gaussian distribution. 
Any deviation from a Gaussian was initially thought to 
be caused by the use of the median instead of average 
in the rejection criterion. However we found that while 
using the average did produce a more Gaussian shape, 
it was still not a perfect Gaussian. It was demonstrated 
that, as expected, the fraction of CRs found changed 
with slope and number of samples (though barely) . Fur- 
thermore, we showed that the false detections are evenly 
distributed in all samples, and that the CR sample num- 
ber did not change the results. Finally, we found that 
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Figure 14. CR detection rate as a function of CR magnitude. 
Each line is for a different number of CRs in a ramp. 

there is no noticeable difference in the performance of 
the 2pt diff method with muhiple CRs, up to 9 CRs 
on a ramp with 40 samples. 

Overall, the 2pt diff method is fast, consistent, and 
easy to understand and calibrate (e.g., choose a rejection 
threshold), and it gives the best results in the photon- 
noise-dominated regime. 

6.2. Y- Intercept Method 
For the y-int method, there are two regimes: 

1. Photon- noise-dominated: In this regime the y-int 
method gives the same answer as the 2pt diff 
method. This is due to the correlated behavior 
of the noise in the ramps (e.g., a Scr event due to 
the noise of the photons that have arrived in the 
last sample time will propagate through all sub- 
sequent samples, having the same effect as a CR). 
Calculating the linear fit of subsequent samples will 
not negate the fact that the noise in the photon- 
dominated regime is set by the noise in a single 
sample. 

2. Read noise-dominated: In this regime the y-int 
method is better than the 2pt diff method. These 
two methods will still give the same results on any 
noise above the 2pt diff threshold, but anything 
below that will only be detectable by the y-int 
method. With the noise independent from sample 
to sample, a linear fit reduces the uncertainty in 
the y-intercepts and weaker CRs are detectable. 

While it is unlikely that the uncertainty will be read- 
noisc-dominatcd for most MIRI data (given the tele- 
scope and sky background), it will be the case for the 



JWST NIRSpec, NIRCam, and the Tunable Filter Im- 
ager (TFI). NIRSpec has a higher resolution than the 
TFI and therefore a lower background. Regardless, 
the TFI will likely be read-noise limited, given that it 
uses a tunable filter (narrowbands arc less sky /telescope- 
dominated) . 

The Y-INT method, however, does not quite match the 
robustness of the 2pt diff method. In theory, these 
two methods should be the same in the photon-noise- 
dominated regime. However, the 2pt diff method gave 
more consistent results for different slopes, number of 
samples, and CR location in the ramps and for multiple 
CRs. The difference is that the 2pt diff method has a 
simpler noise behavior, while we have not fully solved for 
the noise behavior for the Y-int method. Despite being 
careful when dealing with correlated and uncorrclated 
errors when fitting a line to a ramp or semiramp, we did 
not take this into consideration when taking the average 
of the slopes of the semiramps. Instead we use a weighted 
average, where the weights are simply the uncertainties 
in the slope. A solution to this problem would be to fit 
each semiramp and the CR simultaneously, as done by 
Eobberto (200^ for uncorrclated errors only, modifying 
the method to account for correlated errors. Regardless, 
when working in the read-noisc-dominated regime, the 
Y-INT method should be used. 

6.3. Deviation from Fit Method 

The DEV FROM FIT method did not perform as weU 
as the other two methods. The Y-int method domi- 
nated in the read-noise regime, and in the photon-noise- 
dominatcd regime the dev from fit method, like the 
Y-INT method, was not as consistent as the 2pt diff 
method when it came to different slopes, number of sam- 
ples, CR location in the ramps, and multiple CRs. There- 
fore, to have a consistent fraction of false detections, we 
would have to set a new rejection threshold based on 
these variables. Furthermore, while not as complex at 
the Y-INT method, the dev from fit method is stiff 
more complex than the 2pt diff method and thus com- 
putationally more expensive. Finally, the uncertainties 
arc nowhere near Gaussian, thus making it difficult to 
set the rejection threshold. The dev from fit method 
was only slightly worse at detecting CRs than the 2pt 
DIFF method, but given the fact that it is not as robust, 
and requires more CPU time, the dev from fit method 
is not recommended. 

The DEV FROM fit method could be improved by 
changing the weighting scheme so that instead of using 
the slope in the calculation of the photon-noise (since 
the slope will still include the CR in its calculation), we 
could use the median of the two-point differences as we 
do for the 2pt diff method. However, it still would not 
compare with the 2pt diff method, as we cannot do 
better than twon,. terms and one rip term by the nature 
of our problem. Furthermore, even if we were able to get 
the same results for these two methods, the dev from 
FIT method still requires larger computational resources 
than the 2pt diff method. 

7. CONCLUSIONS 

In this article we discussed three CR detection methods 
for nondestructive read ramps, as well as their strengths 
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and weaknesses. We applied these methods to simulated 
ramps with single-sample groups. We showed that the 
covariance matrix must include correlated errors in order 
to improve the calculation of the slope and y-intercept 
of a ramp and their uncertainties. The y-int method 
benefits most from this use of the covariance matrix, and 
would not be an improvement of the 2pt diff method 
otherwise. 

The 2pt diff method was shown to be able to give the 
best results in the photon-noise-dominated regime. The 
method's uncertainties are quasi-Gaussian which simpli- 
fies the process of choosing a rejection threshold. More- 
over, it is fast and consistent. Its robustness, compared 
with the other two methods, resides in the fact that we 
simply remove the two-point difference that includes the 
CR, instead of splitting the ramp into two semiramps, 
when searching for other CRs and calculating the slope 
and the y-intercept. 

The DEV FROM FIT method results in a similar fraction 
of CRs found as the 2pt diff method for some ramps, 
but unlike the 2pt diff method these results change 
based on slope, number of samples, and CR location in 
the ramp and for multiple CRs. If we also consider the 
fact that it is computationally more expensive, we do not 
recommend this method for use in any regime. 

The Y-iNT method achieves the best results in the 
read-noise-dominated regime, and returns the same re- 
sults as the 2pt diff method in photon-noise-dominated 
regime. In the photon-noise-dominated regime, the Y- 
INT method behaves like the 2pt diff method in that 
it is effectively calculating an average slope, excluding 
disturbances from any CRs (in the diff method this is 
done by taking the median of the two-point differences). 
The average slope is then divided by the noise due to 
one sample time: one photon-noise and two read-noise. 
The Y-iNT method is better in the read-noise-dominatcd 
regime as the uncertainties on the line fit parameters di- 
minish in size as more points are fit. This is not the case 
in the photon-noise-dominatcd regime due to the corre- 
lated nature of the noise. The y-int method only has 
two drawbacks: the noise model is not fully understood, 
and it is a complex algorithm. 

In summary, if we take all results into consideration we 
are led to suggest that if only one method can be used 
on data cubes, the y-INT method should be it. If com- 
putational speed is an issue then the 2pt diff method 
should be used, especially where photon-noise dominates, 
and the Y-int method should only be used when the es- 
timated slope is in the read-noise-dominated regime or 
where there is information, such as a nearby CR and 
cross-talk is suspected, to look for fainter effects. 
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